HW 3: Drafting Viz
Description
- Which option do you plan to pursue?
I plan to pursue Option 1.
- Restate your questions:
What is the makeup of emissions in the US?
- What are the major sources of pollution?
- What sectors do these belong to?
- Which pollutants are most prominent?
- How do emissions differ by state?
- Explain which variables from your data set(s) you will use to answer your question(s), and how.
To answer my first two questions, I need to group the cleaned dataset by source_description and sum emissions_tons to get total emissions by source. Then, I can investigate the different sectors contributing to air pollution by grouping the dataset by eis_sector and summing emissions_tons. However, some of these sectors are broken down even further. I am only interested in the overarching sector, so to consolidate I will assign the overarching sector to a new variable, sector. Then, I can group by sector and sum emissions_tons again for a better idea of emissions by sector.
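The consolidation step could be wrapped in a small helper so the same mapping does not have to be repeated for every chart. This is only a sketch: the name collapse_sector() and the toy eis_sector values are assumptions for illustration, and the patterns simply mirror the case_when() mapping used in the mock-up code.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: map a detailed eis_sector value to its overarching sector.
# Unmatched values come back as NA, the same as a case_when() without a fallback.
collapse_sector <- function(eis_sector) {
  case_when(
    str_detect(eis_sector, "Agriculture") ~ "Agriculture",
    str_detect(eis_sector, "Biogenics") ~ "Biogenics",
    str_detect(eis_sector, "Bulk Gasoline") ~ "Bulk Gasoline Terminals",
    str_detect(eis_sector, "Commercial Cooking") ~ "Commercial Cooking",
    str_detect(eis_sector, "Dust") ~ "Dust",
    str_detect(eis_sector, "Fires") ~ "Fires",
    str_detect(eis_sector, "Fuel Comb") ~ "Fuel Comb",
    str_detect(eis_sector, "Gas Stations") ~ "Gas Stations",
    str_detect(eis_sector, "Industrial Processes") ~ "Industrial Processes",
    str_detect(eis_sector, "Miscellaneous") ~ "Misc",
    str_detect(eis_sector, "Mobile") ~ "Mobile",
    str_detect(eis_sector, "Solvent") ~ "Solvent",
    str_detect(eis_sector, "Waste Disposal") ~ "Waste Disposal"
  )
}

# Toy data standing in for emissions_cleaned (sector names are made up)
toy <- data.frame(
  eis_sector = c("Mobile - On-Road", "Mobile - Aircraft",
                 "Fires - Wildfires", "Dust - Paved Road Dust"),
  emissions_tons = c(10, 5, 3, 2),
  stringsAsFactors = FALSE
)

# Total emissions per overarching sector
sector_totals <- toy %>%
  mutate(sector = collapse_sector(eis_sector)) %>%
  group_by(sector) %>%
  summarise(emissions_tons = sum(emissions_tons), .groups = "drop")
```

Because case_when() returns the first match, the order of the patterns matters: a value that mentions both "Agriculture" and "Dust" would be assigned to Agriculture.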
To answer my third question, I can group the dataset by pollutant_type and sum emissions_tons. This will give me the breakdown of emissions by GHG, HAP, CAP and CAP/HAP. Alternatively, I can group the dataset by pollutant and sum emissions_tons. This is what I have created in my mockup, but I will have to brainstorm how to represent the pollutants with very small totals. Perhaps I only visualize the top 5 or 10.
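One possible way to handle the small pollutants is to keep the top few and lump the rest into a single "Other" row. The sketch below uses toy data; the pollutant names, the cutoff top_n_keep, and the "Other" label are all assumptions for illustration.

```r
library(dplyr)

# Toy data standing in for emissions_cleaned (pollutant names are made up)
toy <- data.frame(
  pollutant = c("A", "A", "B", "C", "D"),
  emissions_tons = c(100, 50, 30, 2, 1),
  stringsAsFactors = FALSE
)

top_n_keep <- 2  # hypothetical cutoff; the real chart might use 5 or 10

# Total emissions per pollutant
by_pollutant <- toy %>%
  group_by(pollutant) %>%
  summarise(emissions_tons = sum(emissions_tons), .groups = "drop")

# Option 1: keep only the largest pollutants
top_pollutants <- by_pollutant %>%
  slice_max(emissions_tons, n = top_n_keep)

# Option 2: keep the largest pollutants and lump the remainder into "Other"
lumped <- by_pollutant %>%
  mutate(pollutant = if_else(pollutant %in% top_pollutants$pollutant,
                             pollutant, "Other")) %>%
  group_by(pollutant) %>%
  summarise(emissions_tons = sum(emissions_tons), .groups = "drop")
```

Lumping keeps the overall total intact, which matters if the chart is meant to show shares of all emissions rather than only the largest contributors.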
Finally, I will sum emissions_tons by state. However, I think this might be more meaningful if I also normalize by state area. If I join my data with the state.area data in R, I can divide total emissions by area (in square miles) and end up with emissions (tons) per square mile for each state.
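A minimal sketch of the area normalization, using toy numbers. It assumes the state column holds full state names; the built-in state.name and state.area vectors cover only the 50 states, so any other entry (for example the District of Columbia or a territory) ends up with a missing area after the join and is worth checking for.

```r
library(dplyr)

# Built-in lookup: state names and land areas in square miles
state_data <- data.frame(
  state = state.name,
  area = state.area,
  stringsAsFactors = FALSE
)

# Toy data standing in for emissions_cleaned (values are made up)
toy <- data.frame(
  state = c("California", "California", "Rhode Island", "District of Columbia"),
  emissions_tons = c(1000, 500, 100, 10),
  stringsAsFactors = FALSE
)

# Sum first, then join, so the lookup is matched once per state
per_area <- toy %>%
  group_by(state) %>%
  summarise(total_emissions = sum(emissions_tons), .groups = "drop") %>%
  left_join(state_data, by = "state") %>%
  mutate(tons_per_sq_mile = total_emissions / area)

# Entries with no match in state.name have NA area
unmatched <- per_area %>%
  filter(is.na(area)) %>%
  pull(state)
```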
- Borrowed visualizations:
Where Do Emissions Come From? I really like how the overarching categories (“Energy”, “Industrial processes”, “Agriculture, etc.” and “other”) are further broken down and represented with a hue in the circular bar chart. I would like to borrow this framework to visualize the breakdown of eis_sector for my dataset (if I keep the subcategories).
Pollutant Infographic: I like how the size of the clouds represents the pollutant_type variables. I was thinking of representing “which pollutant types are most prominent” with clouds, so it would be very similar to this but by total emissions (in tons) instead of percentages.
- Hand drawn visualizations
- Mock-up
# load packages
library(tidyverse)
library(here)
library(ggwordcloud)
library(geofacet)
library(paletteer)
library(showtext)
library(patchwork)
# load data
emissions_cleaned <- read_csv(here("data", "emissions_cleaned.csv"))
# get colors
my_colors <- paletteer::paletteer_d("palettetown::gloom")
showtext_auto()
# import google fonts
font_add_google(name = "DM Serif Display", family = "dm_serif") # titles
font_add_google(name = "IBM Plex Sans Condensed", family = "ibm_condensed")
titletext <- "ibm_condensed"
font_add_google(name = "DM Sans", family = "dm_sans") # text
font_add_google(name = "IBM Plex Sans", family = "ibm_plex")
subtext <- "ibm_plex"
Sources
# emissions by source (bar chart)
sources <- emissions_cleaned %>%
group_by(source_description) %>%
summarise(emissions_tons = sum(emissions_tons))
# bar chart
stack <- ggplot(sources, aes(source_description, emissions_tons)) +
# choose "smokestack" esc color
geom_col(fill = "#D0D0D0FF") +
# label bars with emissions
geom_text(aes(label = scales::label_comma(accuracy = 1, suffix = " t")(emissions_tons)),
vjust = - 0.5,
size = 6,
family = subtext,
fontface = "bold") +
# set base theme
theme_void() +
# adjust theme
theme(
# add x axis text back
axis.text.x = element_text(family = subtext,
face = "bold",
size = 17,
# move closer to bars
margin = margin(t = -10,
b = 10)),
# extend plot margin at top
plot.margin = margin(t = 20)
)
stack
Sector
# emissions by sector
sector <- emissions_cleaned %>%
group_by(eis_sector) %>%
summarise(emissions_tons = sum(emissions_tons)) %>%
# combine subsectors
mutate(sector = case_when(
str_detect(eis_sector, "Agriculture") ~ "Agriculture",
str_detect(eis_sector, "Biogenics") ~ "Biogenics",
str_detect(eis_sector, "Bulk Gasoline") ~ "Bulk Gasoline Terminals",
str_detect(eis_sector, "Commercial Cooking") ~ "Commercial Cooking",
str_detect(eis_sector, "Dust") ~ "Dust",
str_detect(eis_sector, "Fires") ~ "Fires",
str_detect(eis_sector, "Fuel Comb") ~ "Fuel Comb",
str_detect(eis_sector, "Gas Stations") ~ "Gas Stations",
str_detect(eis_sector, "Industrial Processes") ~ "Industrial Processes",
str_detect(eis_sector, "Miscellaneous") ~ "Misc",
str_detect(eis_sector, "Mobile") ~ "Mobile",
str_detect(eis_sector, "Solvent") ~ "Solvent",
str_detect(eis_sector, "Waste Disposal") ~ "Waste Disposal"
)) %>%
group_by(sector) %>%
summarise(emissions_tons = sum(emissions_tons)) %>%
mutate(label = paste0(sector, " (", round((emissions_tons/sum(emissions_tons))*100, 0), " %)"))
# donut chart
ggplot(sector, aes(x = 2, y = emissions_tons, fill = label)) +
geom_bar(stat = "identity", width = 1) +
# use polar coordinates
coord_polar(theta = "y", start = 0) +
# set base theme
theme_void() +
# create hole
xlim(0.5, 2.5) +
# set legend
theme(
legend.position = "right",
legend.title = element_text(family = subtext,
size = 15,
face = "bold"),
legend.text = element_text(family = subtext,
size = 10)
) +
labs(fill = "Sector") +
# use gloom palette
scale_fill_paletteer_d("palettetown::gloom")
Sector
# remove mobile sector for further analysis
sector %>%
filter(sector != "Mobile") %>%
# generate new donut chart
ggplot(aes(x = 2, y = emissions_tons, fill = label)) +
geom_bar(stat = "identity", width = 1) +
# use polar coordinates
coord_polar(theta = "y", start = 0) +
# set base theme
theme_void() +
# create hole
xlim(0.5, 2.5) +
# set legend
theme(
legend.position = "right",
legend.title = element_text(family = subtext,
size = 15,
face = "bold"),
legend.text = element_text(family = subtext,
size = 10)
) +
labs(fill = "Sector") +
# use gloom palette
scale_fill_paletteer_d("palettetown::gloom")
Sector
# remove mobile and fires
sector %>%
filter(!sector %in% c("Mobile", "Fires")) %>%
# generate new donut chart
ggplot(aes(x = 2, y = emissions_tons, fill = label)) +
geom_bar(stat = "identity", width = 1) +
# use polar coordinates
coord_polar(theta = "y", start = 0) +
# set base theme
theme_void() +
# create hole
xlim(0.5, 2.5) +
# set legend
theme(
legend.position = "right",
legend.title = element_text(family = subtext,
size = 15,
face = "bold"),
legend.text = element_text(family = subtext,
size = 10)
) +
labs(fill = "Sector") +
# use gloom palette
scale_fill_paletteer_d("palettetown::gloom")
# emissions by sector
sector_2 <- emissions_cleaned %>%
#filter
group_by(eis_sector, scc_level_2) %>%
summarise(emissions_tons = sum(emissions_tons)) %>%
# combine subsectors
mutate(sector = case_when(
str_detect(eis_sector, "Agriculture") ~ "Agriculture",
str_detect(eis_sector, "Biogenics") ~ "Biogenics",
str_detect(eis_sector, "Bulk Gasoline") ~ "Bulk Gasoline Terminals",
str_detect(eis_sector, "Commercial Cooking") ~ "Commercial Cooking",
str_detect(eis_sector, "Dust") ~ "Dust",
str_detect(eis_sector, "Fires") ~ "Fires",
str_detect(eis_sector, "Fuel Comb") ~ "Fuel Comb",
str_detect(eis_sector, "Gas Stations") ~ "Gas Stations",
str_detect(eis_sector, "Industrial Processes") ~ "Industrial Processes",
str_detect(eis_sector, "Miscellaneous") ~ "Misc",
str_detect(eis_sector, "Mobile") ~ "Mobile",
str_detect(eis_sector, "Solvent") ~ "Solvent",
str_detect(eis_sector, "Waste Disposal") ~ "Waste Disposal"
)) %>%
filter(sector == "Mobile") %>%
group_by(scc_level_2) %>%
summarise(emissions_tons = sum(emissions_tons)) %>%
mutate(label = paste0(scc_level_2, " (", round((emissions_tons/sum(emissions_tons))*100, 0), " %)"))
# expand mobile sector for further analysis
# generate new donut chart
ggplot(sector_2, aes(x = 2, y = emissions_tons, fill = label)) +
geom_bar(stat = "identity", width = 1) +
# use polar coordinates
coord_polar(theta = "y", start = 0) +
# set base theme
theme_void() +
# create hole
xlim(0.5, 2.5) +
# set legend
theme(
legend.position = "right",
legend.title = element_text(family = subtext,
size = 15,
face = "bold"),
legend.text = element_text(family = subtext,
size = 10)
) +
labs(fill = "Sector") +
# use gloom palette
scale_fill_paletteer_d("palettetown::gloom")
I’d like to represent the sector categories by hue, as in the “Where Do Emissions Come From?” borrowed visualization. This will take some workshopping.
Pollutant Cloud
# emissions by pollutant
pollutant <- emissions_cleaned %>%
group_by(pollutant) %>%
summarise(emissions_tons = sum(emissions_tons)) %>%
arrange(desc(emissions_tons)) %>%
slice(1:10)
# emissions by pollutant type
pollutant_type <- emissions_cleaned %>%
group_by(pollutant_type) %>%
summarise(emissions_tons = sum(emissions_tons))
# cloud plot
cloud <- ggplot(pollutant, aes(label = pollutant, size = emissions_tons)) +
geom_text_wordcloud(family = titletext) +
#scale_size_area(max_size = 20) +
theme_minimal()
cloud
Map
# Convert built-in state.area to a df
state_data <- data.frame(
state = state.name,
abbrv = state.abb,
area = state.area
)
# Merge datasets using left_join
state <- emissions_cleaned %>%
left_join(state_data, by = "state") %>%
group_by(state, abbrv, area) %>%
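# drop grouping so the max below is taken across all states
summarise(total_emissions = sum(emissions_tons), .groups = "drop") %>%
mutate(rel_emissions = total_emissions/area) %>%
arrange(desc(rel_emissions)) %>%
# scale opacity to the largest tons-per-square-mile value
# (replaces the hard-coded 5068.261374, assumed to be that maximum)
mutate(opacity = rel_emissions/max(rel_emissions, na.rm = TRUE))
core <- "#F87000FF"
accent <- "gray20"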
ggplot(state) +
# initiate a plot with rectangles, shading by relative emissions (opacity value) ----
geom_rect(aes(xmin = 0, xmax = 1, ymin = 0, ymax = 1, alpha = opacity),
fill = core) +
# label with state abbreviation ----
geom_text(aes(x = 0.5, y = 0.7, label = abbrv),
size = 8,
family = subtext,
color = "black") +
# label with emissions (tons per square mile) ----
geom_text(aes(x = 0.5, y = 0.3, label = round(rel_emissions, 0)),
size = 5,
family = subtext,
color = "black") +
# break rectangle up by state ----
geofacet::facet_geo(~state) +
# make each rectangle the same size ----
coord_fixed(ratio = 1) +
# add title, subtitle, and caption ----
labs(title = "Emissions by State",
subtitle = "Tons per Square Mile",
caption = "Data Source: EPA National Emissions Inventory 2020") +
# apply a completely empty theme ----
theme_void() +
# further customize theme ----
theme(
# remove headers from faceted plots ----
strip.text = element_blank(),
# adjust the font and color of the title ----
plot.title = element_text(family = titletext,
face = "bold",
size = 30,
hjust = 0.5,
margin = margin(t = 10,
b = 10)),
# adjust the font of the subtitle ----
plot.subtitle = element_text(family = subtext,
size = 20,
hjust = 0.5,
margin = margin(b = 10)),
# remove legend ----
legend.position = "none",
plot.margin = margin(b = 10)
)
Extra
# emissions by sector and pollutant type
sector_breakdown <- emissions_cleaned %>%
group_by(eis_sector, pollutant_type) %>%
summarise(emissions_tons = sum(emissions_tons)) %>%
mutate(sector = case_when(
str_detect(eis_sector, "Agriculture") ~ "Agriculture",
str_detect(eis_sector, "Biogenics") ~ "Biogenics",
str_detect(eis_sector, "Bulk Gasoline") ~ "Bulk Gasoline Terminals",
str_detect(eis_sector, "Commercial Cooking") ~ "Commercial Cooking",
str_detect(eis_sector, "Dust") ~ "Dust",
str_detect(eis_sector, "Fires") ~ "Fires",
str_detect(eis_sector, "Fuel Comb") ~ "Fuel Comb",
str_detect(eis_sector, "Gas Stations") ~ "Gas Stations",
str_detect(eis_sector, "Industrial Processes") ~ "Industrial Processes",
str_detect(eis_sector, "Miscellaneous") ~ "Misc",
str_detect(eis_sector, "Mobile") ~ "Mobile",
str_detect(eis_sector, "Solvent") ~ "Solvent",
str_detect(eis_sector, "Waste Disposal") ~ "Waste Disposal"
)) %>%
group_by(sector, pollutant_type) %>%
summarise(emissions_tons = sum(emissions_tons))
Decision points
Ultimately, I decided that including subgroups in the donut charts would be too overwhelming. Perhaps I can use the breakdown by scc_level_2 of the mobile sector (the majority sector) instead.
Additionally, instead of including “cloud” sizes by total emissions, I decided to make a word cloud of pollutants, representing emissions in tons by text size.
Should I normalize emissions by population size instead of state area?
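To compare the two options, a per-capita version could be sketched the same way as the area version. The example below uses the Population column of R's built-in state.x77 matrix purely as a stand-in: those figures are in thousands and date from the mid-1970s, so a real comparison against 2020 emissions would need a current population table, which is not part of this dataset. The toy emission totals are made up.

```r
library(dplyr)

# Stand-in population lookup from the built-in state.x77 matrix.
# Population is stored in thousands and is decades out of date.
state_pop <- data.frame(
  state = rownames(state.x77),
  population = unname(state.x77[, "Population"]) * 1000,
  stringsAsFactors = FALSE
)

# Toy state totals standing in for the summarised emissions data
toy <- data.frame(
  state = c("Wyoming", "California"),
  total_emissions = c(500, 5000),
  stringsAsFactors = FALSE
)

# Emissions per person instead of per square mile
per_capita <- toy %>%
  left_join(state_pop, by = "state") %>%
  mutate(tons_per_person = total_emissions / population)
```

With these toy numbers the ranking flips: California has the larger total, but Wyoming has the larger per-person value, which is the kind of difference the choice of denominator can produce.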